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Results from two studies of full QCD with two flavours of dynamical Wilson fermions are presented. At 
/3 = 5.6, the region 0.83 > > 0.56 at m^a > (0.23L) -1 is explored. The SESAM collaboration has generated 

ensembles of about 200 statistically independent configurations on a 16 3 x 32-lattice at three different K-values 
and is entering the final phase of data analysis. The T^L simulation on a 24 x 40-lattice at two ^-values has 
reached half statistics and data analysis has started recently, hence most results presented here are preliminary. 
The focus of this report is threefold: (i) we demonstrate that algorithmic improvements like fast Krylov solvers 
and parallel preconditioning recently introduced can be put into practise in full QCD simulations, (ii) we present 
encouraging observations as to the critical dynamics of the Hybrid Monte Carlo algorithm in the approach to the 
chiral limit, (in) we mention signal improvements of noisy estimator techniques for disconnected diagrams to the 
tt-N a term, and (iv) we report on SESAM's results for light hadron spectrum, light quark masses, and heavy 
quarkonia. 
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1. INTRODUCTION 

In this talk I will give an interim status re- 
port about two large-scale computer simulations 
of full lattice QCD with two degenerate flavours 
of dynamical Wilson fermions. Both simulations, 
SESAM and T%L, are based on the Hybrid Monte 
Carlo algorithm [1,2] at an inverse coupling of 
/?=-% = 5.6, running on 16 3 x 32 and 24 3 x 40 
lattices, respectively. Our platform is the parallel 
supercomputer APEIOO/Quadrics. The SESAM 
simulation took place on a 256-node 12.8 Gflops 
machine, while the T%L production is still ongo- 
ing on two 512-node 25.6 Gflops system. Our 
goal is to reveal the effects of dynamical quarks 
on physical quantities, hence we operate as close 
as possible to the chiral limit. It goes without 
saying that this task requires large lattices and 
high statistics. Therefore both technical and al- 
gorithmic improvements are crucial to achieve the 
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progress needed. As we Europeans are still liv- 
ing in the Pre-Teraflops era, we have put much 
emphasis on the improvement of numerical al- 
gorithms and the verification of their stochastic 
efficiency. I will show in this contribution that 
we (i) could boost our simulation speed substan- 
tially and (ii) perform a reliable determination 
of the autocorrelation time r of the HMC algo- 
rithm. Another important issue of our work is 
signal preparation in the computation of discon- 
nected diagrams to the tt-N a term. The last part 
of my talk is devoted to physics results on light 
hadron and quark masses and heavy quarkonia. 

1.1. SESAM 

The name 'SESAM' is our slogan and magic 
acronym in the search for Sea Quark Effects on 
Spectrum And Matrix Elements in full QCD 
with dynamical Wilson fermions. This is in- 
deed a Tera-computing task. In order to meet 
the challenge within the resources available to us 
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we decided to head for a landmark by aiming at 
high statistics at one value of (3 rather than at- 
tempting a full scaling investigation [3]. Need- 
less to say that, with Pre-Teraflops machines, it 
is highly non-trivial to position the hopping pa- 
rameter window for the observation of sea quark 
effects. 

The sea-quark masses and lattice resolutions 
in the SESAM simulation were chosen as small as 
possible in order to be sensitive to sea-quark ef- 
fects and not be disturbed by scaling violations. 
Based on experience from quenched simulations 
we aimed at lattice spacings of a w 0.1 fm cor- 
responding to the quenched (3 ~ 6.0 in order to 
make contact to the scaling region. The results 
of the SCRI-group on the full-QCD /3-function [4] 
suggested to go beyond (3 — 5.5 in order to escape 
the strong coupling regime. To end up with a rea- 
sonable physical lattice size — from valence quark 
studies at all three different K SC a's — we chose to 
work at (3 = 5.6. 

In the tuning of k towards its chiral regime, 
we aim at small values of m^/m pi under the con- 
dition that finite-size effects remain tolerable. In 
fact we required m^aL > 4. This limits the small- 
est — ratio that can be attained. We approached 
this minimal in an iterative manner, start- 
ing with our largest bare mass. We took into ac- 
count the statistics that could be achieved given 
1 year runtime on a 256 node Quadrics QH2, 
extrapolating the algorithmic specifications from 
[3,5]. Eventually we have generated ensembles at 
3 dynamical K-values, K sea — 0.1560,0.1570 and 
0.1575. Production ended in December 1996. Al- 
together, for the SESAM project, we have spent on 
APE slightly more than 100 Teraflops-hours. In 
order to put the statistics achieved with APE into 
perspective we present an overview of the charac- 
teristics of recent full QCD simulations with dy- 
namical Wilson fermions in Tab. 1 and in Fig. 1. 

The SESAM run parameters together with some 
major monitoring quantities are given in Tab. 2. 
The average number of iterations is denoted by 
# it, while N m d stands for the number of molec- 
ular dynamics steps for a trajectory, which varies 
stochastically according to a symmetric distribu- 
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Figure 1. SESAM and T%L statistics as a func- 
tion of the control parameter m^/m p , compared 
to preceding full QCD simulations with Wilson 
fermions. 



tion of width a in order to avoid interlocks with 
Fourier modes, r = ^"j"*^ is the stopping crite- 
rion for the iterative solver, alg specifies the in- 
version algorithm used, see sections 2.1. and 2.2. 
After thermalization of each K-run we have held 
the molecular dynamics parameters fixed; this en- 
ables us to carry out a sensible autocorrelation 
analysis. We found that switching k, while hold- 
ing the molecular dynamics time step fixed, is af- 
fecting the acceptance rate only marginally, see 
Tab. 2. Therefore we have used a universal time 
step of dt = 0.01 for most dynamical samples. 

It was one of the goals of SESAM to perform 
for the first time a detailed autocorrelation study 
on an adequate set of physical quantities. To 
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Table 1 



Some characteristic figures from SESAM, T%L and previously performed large scale simulations with two 
flavours of dynamical Wilson fermions. 



group 


size 


P 


K 


L s [fm] 




machine 


SESAM 


16 3 32 


5.6 


0.156 
0.157 
0.1575 


1.44(5) 
1.39(4) 
1.32(4) 


5400 
5350 
4500 


QH2 


T X L 


24 3 40 


5.6 


0.1575 
0.1580 


1.93 


2500 
2400 


QH4 


LANL[3] 


16 4 


5.4 


0.1600 
0.1610 
0.1620 


2.16 


564 
972 
239 


CM2 






5.5 


0.1580 
0.1590 
0.1600 


1.664 


301 
396 
514 


CM2 






5.6 


0.1560 
0.1570 


1.232 


601 
756 


CM2 




16 3 32 


5.5 


0.1600 




231 


CM2 






5.6 


0.1570 
L 0.1575 




318 
168 


CM2 


HEMCGC[5] 


16 3 32 


5.3 


0.1670 
0.1575 


1.744 


2425 
1270 


CM2 


SCRI[6] 


16 3 32 


5.5 


0.1596 
0.1600 
0.1604 




- 2000 

- 2000 

- 2000 


CM2 



this end we decided to archive all HMC trajecto- 
ries and study more difficult observables, such as 
glueballs and topological features, in postprocess- 
ing style using the Crays T3E, T90 and Personal 
Computers at HLRZ Jiilich. 

Simulations of full QCD with two flavours of 
Wilson fermions at zero temperature so far have 
been carried out on lattices of size < 16 3 x 32 
and ratios of ^ > 0.6 [3,5-71. The results 

m p L J J 

of SESAM demonstrate that dedicated supercom- 
puters in the range of about 10 Gflops perfor- 
mance can indeed generate in one year's run- 
time statistically significant full QCD samples 
at ~ 0.70. Alas, according to X PT (con- 

structing a fictitious pseudoscalar meson contain- 
ing two 'strange quarks', with mass ratio of the 
size quoted), 

a* „ vM = 69) (1) 



we find ourselves still in the region of strange 
quarks. Thus, in order to quantify light sea quark 
effects in full QCD, one would wish to come closer 
to the chiral limit and to finer lattice resolutions 
than achieved previously. This implies larger lat- 
tice volumes. 

1.2. T X L 

In a feasibility study on a 24 3 x 40 lattice we 
have investigated whether a further step towards 
the chiral limit is in reach of the APE Tower com- 
puting power. We tuned ourselves to a realistic 
working point at a volume of (2 fm) 3 , with chiral- 
ity characterised by — w 5.6 and ^ < 0.6 [81. 
We found that by use of preconditioning tech- 
niques we could accelerate the matrix inversion 
[9,10] sufficiently for a 512-node APE Tower to 
drive an optimised HMC code fast enough (i) to 
increase the lattice size by more than a factor of 
4 compared to the previous standards including 
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Table 2 



K 


alg 


T 


N m d ± &N md 


acc [%] 


r 


#it 


Ncsg 




# traj 


# therm 


0.156 


o/e 


1 


100 ± 20 


85 


io- 8 


85(3) 
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0.8388(41) 


5000 


400 




o/e 
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100 ± 20 


84 


io- 8 


168(5) 


8 




3500 


350 


0.1570 












0.7552(69) 








SSOR 
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100 ± 20 


80 


io- 8 


125(3) 


9 
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o/e 
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100 ± 20 


76 


10"« 


317(12) 


11 


0.688(12) 


3000 


500 


0.1575 


















SSOR 


0.5 


71 ± 12 


73 


io- 8 


150(6) 
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2000 






SESAM, (n) and to go more chiral. 

In the framework of the Italian-German T^L- 
collaboration we launched an 18 month Hybrid 
Monte Carlo simulation, mainly running on the 
APE 100 Tower at INFN Rome and partly on the 
QH4 Zeuthen/Berlin at j3 = 5.6. Our HMC im- 
plementation reaches a sustained performance of 
17 Gflops with 25.6 Gflops being the APE Tower 
peak speed. 

We switched from the thinned o/e representa- 
tion to the full representation of the fermionic 
matrix M = 1 — kD, employing our new SSOR 
preconditioning scheme, see section 2.2. On the 
APE machine, this scheme offers about an overall 
gain of a factor of 2 in machine time, see Tab. 3. 

We have chosen two k- values, 0.1575 and 0.158 
for two reasons: (i) k = 0.1575 coincides with the 
smallest mass value of the SESAM project relating 
the two lattice sizes at a definite point in param- 
eter space. Thus we are able to assess both the 
influence of finite size effects on physical quanti- 
ties and the volume dependence of the simulation 
algorithm, (it) The 24 3 x 40 lattice allows to in- 
crease the pion correlation length in lattice units, 
by a factor of 1.5 compared to the smallest 
dynamical mass of the SESAM project, which we 
estimated to be sufficient for a chirality ^f- in 
the range of .55. As the lattice scale is refined 
when reducing the bare quark masses we decided 
to stay with /3 = 5.6. 

In order to perform a scaling analysis it would 
have been desirable to simulate at different (3- 
values. The peephole to scaling, cf. Fig. 2, dis- 
closes unfortunately that a substantial decrease 
in scale a would be needed in order to have a 
sufficient lever arm in the continuum extrapola- 



tion. It is instructive though to compare Fig. 2 
to recent improved action results [11]. 




0.02 0.04 0.06 0.08 0.1 
a[fm] 



Figure 2. "Continuum extrapolation" for baryon 
masses (together with SCRI data [6]) obtained at 
(3 = 5.5. 



During the layout phase of T%L (Feb. 1996) we 
determined the most chiral K sca value extrapo- 
lating the relation m q a = \(^\ — ^ on the 
SESAM data available at the time. Requiring 
ggj = .23 we arrived at m q a — 0.023 correspond- 
ing to k = 0.1580, cf. Fig. 3. We considered this 
value of £ w as small enough not to suffer from 
large finite size effects. Needless to say: at the 
end of the day it remains to be seen, see section 
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Figure 3. Fixing K sea for T\\- from SESAM data. 



5, that this parameter choice is reasonably posi- 
tioned within the 'chirality gap'. At the time of 
this meeting, we have generated two ensembles 
of configurations at k — 0.1575 and k — 0.158, 
with more than 2400 trajectories each, within 150 
Teraflops-hours. For T%L we were not able to 
archive each trajectory so we decided to store ev- 
ery other for later autocorrelation analysis and 
detailed postprocessing. 

The T%L run parameters and some monitoring 
quantities of interest are collected in Tab. 3. t 
specifies the time to generate one trajectory on 
the APE Tower. 

2. OPTIMIZING HMC 

The optimisation of fermionic simulation algo- 
rithms constitutes a major target of the scientific 
program of the Wuppertal-HLRZ lattice QCD 
group. From the beginning the two projects, 
SESAM and T%L, have been accompanied by algo- 
rithmic research in collaboration with the Applied 
Mathematics group at Wuppcrtal University. We 
have focused on the acceleration of Krylov sub- 



space solvers within the computer intensive in- 
version part of HMC that is required to calculate 
the fermionic force [12,13]. In these studies the 
so-called biconjugate gradient stabilised method 
(BiCGstab) has been established as the most ef- 
ficient Krylov solver for Wilson fermion matrix 
inversions. It behaves quasi- optimal, i.e., it ap- 
proaches the convergence speed of GMRES, the 
(non-practical) optimal reference Krylov solver. 
Therefore, to make further progress, one has to 
address the development of new parallel precon- 
ditioning methods [9,14]. 



2.1. Krylov solvers 

The practical iterative methods to solve the 
huge system of equations MX — <f) belong to 
the class of Krylov subspace methods. In the old 
days the minimum residual algorithm (MR) has 
been established as the efficient method for both 
propagator computations and the calculation of 
the fermionic force within the HMC [15]. In the 
early nineties, numerical analysts were success- 
ful in developing new Krylov subspace methods 
[16,17]. These methods avoid the squared condi- 
tion number (from M^M inversion) of CG, and 
yet guarantee convergence as opposed to MR. 

In Refs. [12,13,18] we have benchmarked and 
improved various iterative solvers (within simpli- 
fied settings) in order to find the fastest solver for 
Wilson fermion matrix inversions. 

Using SESAM's configurations we can confirm 
the findings of [12]. The over-relaxed MR algo- 
rithm outperforms CG, however BiCGstab beats 
ORMR, see Fig. 2.1. This behaviour is qualita- 
tively the same for all three K sca -values used by 
SESAM. 

Comparing the convergence behaviour of 
BiCGstab with that of OrthoMin(N) reveals that 
BiCGstab is close to the optimal algorithm GM- 
RES. GMRES orthonormalises on all previous 
search directions and therefore is not practical. 
OrthoMin(N) [19] is a practical modification of 
the latter, orthogonalising on N previous di- 
rections only. The insertion demonstrates that 
BiCGstab is beating OrthoMin up to a depth of 
N = 10 and further on is about 10 % less efficient 
in terms of iterations. In view of these findings, 
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Table 3 



Simulation parameters and some monitor quantities for the T%L runs. 



K 


alg 


T 


N m d ± <?N md 


r 


t/traj [s] 


7£ [Prel!] 


# traj 


# therm 


0.1575 


o/e 
SSOR 


1 

0.5 


125 ± 20 
125 ± 20 


IO" 8 

io- 8 


8200 
3800 


0.70(4) 


2500 


500 


0.158 


SSOR 


0.5 


125 ± 20 


io- 8 


9200 


0.56(4) 


2400 


800 
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Figure 4. Comparing the convergence of the sim- 
ple Jacobi iteration, ORMR and BiCGstab at 
j3 = 5.6 and k = 0.1575 for a typical 16 3 x 32 
full QCD configuration. The insertion shows the 
results for OrthoMin(N) compared to BiCGstab 
(straight line). 



BiCGstab can be considered as quasi optimaP. 

2.2. Parallel preconditioning 

The quasi-optimality of BiCGstab suggests to 
turn attention on multigrid methods and/or pre- 
conditioning in order to achieve further speed up 
in Wilson fermion matrix inversions. The applica- 

2 In terms of computer time expense, BiCHstab is more 
efficient. 



tion of multigrid techniques is impractical, how- 
ever, due to the gauge noise of the gluonic back- 
ground field entering the fermion matrix. Thus 
preconditioning techniques, i. e. methods to de- 
crease the condition number K 2 of M'M appear 
to be the only promising path to further acceler- 
ate Krylov subspace solvers like BiCGstab. 

A standard preconditioning approach in lattice 
gauge computations rests upon o/e decomposi- 
tion of M [20]. It can yield an efficiency gain 
by a factor of 2 when inverting M. Some years 
ago, Oyanagi [21] exploited incomplete LU (ILU) 
factorisation of the matrix M based on the nat- 
ural, globally lexicographic ordering of the lattice 
points 3 . On local memory or grid-oriented paral- 
lel computers, this preconditioner can hardly be 
implemented efficiently, however. 

Recently we have introduced a new paral- 
lel preconditioning technique suitable for Wil- 
son fermion matrix inversions. Our method 
is called local lexicographic SSOR preconditioner 
(LL-SSOR). As opposed to familiar multicolour 
SSOR preconditioners (like the o/e precondi- 
tioner) which produce a decoupling of variables 
on a very fine grain level, the LL-SSOR method 
provides the flexibility to reduce the decoupling 
to the minimum which is necessary to suit a given 
parallel system. As for any SSOR preconditioner, 
the Eisenstat Trick [22] is crucial for the efficient 
implementation of LL-SSOR. Our numerical ex- 
periences show that LL-SSOR presently offers the 
fastest available solution method on parallel com- 
puters. 

The general preconditioning of M proceeds via 
two non-singular matrices V\ and V2 , the left and 



3 For the Wilson fermion discretisation with Wilson pa- 
rameter r = 1, ILU preconditioning is identical to sym- 
metric successive over-relaxation (SSOR) preconditioning 
with respect to that ordering. 
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right preconditioners, respectively: 

V^MV^x = ij>, 

where <f> = V-^ 1 ^, x = V 2 x. (2) 

A Krylov solver could now be applied directly, by 
replacement of each occurrence of M and 4> by 
V 1 ~ 1 MV 2 ~ 1 and 4>. 

We have chosen to apply symmetric Gaufi- 
Seidel (SSOR) preconditioning. The matrix M 
has to be decomposed into its diagonal, strictly 
lower and strictly upper triangular parts, M = 
I — L — U . Then the SSOR preconditioner is 
specified by 

Vi = I — L, V 2 =I-U. (3) 

Now we find that V 1 +V 2 - M = I. This re- 
lation allows to exploit the Eisenstat-trick [22]: 
V^ 1 MV 2 ~ 1 = KT 1 + Vf 1 ^ - V^ 1 ), so that the 
matrix vector product w = V{~ MV 2 r amounts 
to a 2-step computation 

v = V 2 ~ r, u = (r — v), w = v + u. 

(4) 

The general preconditioned BiCGstab procedure 
is described in Ref. [9]. 

Since the matrices I — L and I — U are trian- 
gular, their inversions are simply performed by 
forward and backward substitution, respectively. 
As example the forward solve (I — L)y = p be- 
comes: 

for i = 1 , . . . , n 

i-1 

{y)i = (p)< + ^ L h(v)j- 

3=1 

In terms of computational cost, a forward fol- 
lowed by a backward solve is as expensive as a 
multiplication with M. Hence in principle there 
is no increase of computational cost with SSOR. 

In the solution of the Wilson fermion inver- 
sion problem the ordering scheme for the lattice 
points x is completely arbitrary. Different order- 
ings yield different matrices M, permutationally 
similar to each other. The efficiency of the SSOR 
preconditioner depends on the ordering scheme 
chosen. 



Consider an arbitrary numbering (ordering) of 
the lattice points. For a given grid point x, the 
corresponding row in the matrix L or U contains 
exactly the coupling coefficients of those nearest 
neighbours of x which have been numbered before 
or after x, resp. Therefore, a generic formula- 
tion of the forward solve for this ordering is given 
by Algorithm 1. The backward solves are done 
similarly, now running through the grid points in 
reverse order. 

Algorithm 1 Forward Solve. 

for all grid points x in the given order 
{ update y x } 
Vx = Px 

for \x = 1 , . . . ,4 

if x — n was numbered before x then 
Vx=Vx + K- mt,x-y.y*-» 
for (jb = 1, . . . ,4 

if x + n was numbered before x then 
Vx=Vx + K- m- x+li y x+li 

So far, odd-even preconditioning was consid- 
ered as the only successful preconditioner for lat- 
tice QCD that is parallelisable. For this particu- 
lar ordering the inverses of I — L and I — U can 
be determined directly, see Ref. [9]. 

Oyanagi some time ago [21] has demon- 
strated that ILU (SSOR) preconditioning, apply- 
ing global lexicographic ordering, yields an im- 
provement over odd-even preconditioning as far 
as the number of iterations is concerned. How- 
ever, its parallel implementation is difficult [23]. 

The ordering we have proposed is similar to 
Oyanagi's approach, however, it is intrinsically 
parallelisable, and it is adaptive to the parallel 
computer used. We assume that the processors 
of the parallel computer are connected as a p\ x 
P2 x P3 x pi 4-dimensional grid. The space-time 
lattice can be matched to the processor grid in an 
obvious natural manner, producing a local lattice 
of size n l ° c x n 2 oc x n 1 ^ x n L f c with n l ° c — rn/pi 
on each processor. 

The whole lattice is divided into n loc — 
n ioc n ioc n ioc n ioc g r0U p S> Each group corresponds 

to a fixed position of a site in the local grid. As- 
sociating a colour with each of the groups, we can 
interpret this process as a colouring of the lattice 
points, see Fig. 5. 
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On Quadrics we have achieved a real overall speed 
up of a factor between 1.5 and 2.1 compared to 
our o/e implementation, sec Tab. 3; however, this 
is a machine dependent result. 



Algorithm 2 11-forward. 

for all colours in lexicographic order 
for all processors 

x := grid point of that colour on that processor 
{ update y x } 



yx — yx 



E 



H, x — n <u x 



M, x + n < u x 



Figure 5. Locally lexicographic ordering and for- 
ward solve in 2 dimensions. 



All nearest neighbours of a given grid point 
have colours different from that point. Perform- 
ing the forward and backward solves within the 
BiCGstab algorithm, grid points having the same 
colour can be worked upon in parallel, thus yield- 
ing an optimal parallelism of p, the number of 
processors. 

A formulation of the ^-forward solve is given 
in Algorithm 2. Here, we use '<;;' as a symbol 
for 'ZZ-less than'. 

On the 'local boundaries' we will have between 
(for the ll-ftvst point) and 8 (for the ZZ-last 
point) summands to add to p x . The parallelism 
achieved is p, and thus is optimal since we have 
p processors. If we change the number of proces- 
sors, the ^/-ordering, and consequently the prop- 
erties of the corresponding SSOR preconditioner 
will change, too. 

The efficiency of LL-SSOR has been tested in 
the framework of SESAM and T%L simulations. 
First we display performance results for k = 0.157 
on an 8 3 x 16 lattice at (3 = 5.6. In Fig. 6 we show 
that the convergence speed of LL-SSOR is about 
twice as fast as that of o/e preconditioning and 
nearly reaches that of Oyanagi preconditioning. 
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Figure 6. Residue reached by a given number of 
matrix- multiplies . 



We also can present results for the volume de- 
pendency of the LL-SSOR preconditioning, see 
Fig. 7. 
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Figure 7. Volume dependency of LL-SSOR effi- 
ciency for BiCGstab. 



Finally we remark that preconditioning and 
chronological start vector guess (CSG) [24] as ap- 
plied in the molecular dynamics evolution part of 
HMC are nearly additive in their iteration gain. 

3. CRITICAL DYNAMICS 

The HMC algorithm is a Markov process at 
heart. Therefore, a full QCD vacuum configura- 
tion series exhibits autocorrelation. Its statistical 
quality is affected crucially by this autocorrela- 
tion which depends in general on the physical ob- 
servable under investigation and are increasing in 
the approach to the chiral limit. 

The exact determination of the autocorrelation 
function amounts to trace the system over infi- 
nite time. Any practical Markov process is fi- 
nite, however, and it is so much the worse that 
tuc m full QCD Hybrid Monte Carlo simulations 
is not large enough compared to the relaxation 
time, tuc — i~ exp . So far this has prevented a 



reliable determination of autocorrelations in full 
QCD simulations with Wilson fermions. 

SESAM has increased the trajectory samples by 
nearly one order of magnitude compared to pre- 
vious studies 4 . It is important to note that we 
rested under stable conditions for the HMC dy- 
namics to evolve rather than retuning MD pa- 
rameters as production went on. This provides 
the setting for a reliable determination of auto- 
correlation times related to various gluonic and 
hadronic quantities. 

Given a time-series of measurements A t , t = 
l,...,tMC we compute a finite time-series ap- 
proximation to the true autocorrelation function 
for observable A: 

, tMC—t 

C A (t) = - V A s A s+t 



tuc — t 



ttAC 



tMC 
S = l 



A, 



(5) 



This two-point function in the Monte Carlo time 
can be normalised by 



P A (t) 



c A (t) 

C A (0)' 



(6) 



The exponential autocorrelation time is defined 
as the inverse decay rate of the slowest mode con- 
tributing to the autocorrelation function: 



' exp 



limsup ~~T7Tv 

t^oo - log p A (t) 



(7) 



T^ xp is a relaxation parameter and is related to 
the length of the thcrmalization phase [25] of the 
Markov process. In our simulations we required 
the thermalization length to be 5 x r exp such as to 
achieve a suppression of the starting conditions of 
order C(cxp(— 5)) in the ensemble. Furthermore 
T^ xp is a characteristic time to achieve ergodic- 
ity: the simulation has to be much larger than 

suPaKIpI- 

The integrated autocorrelation time is defined 

as the integral: 



int 



(8) 



4 TxL aims at 0(4000) trajectories per dynamical sample. 
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In equilibrium r^ t characterises the true statisti- 
cal error of the observable A computed from the 
ensemble. The sample mean 

1 N 

W = nT, A ^> ( 9 ) 
i=i 

has the variance 

a A ~ N _ 1 2r tt C A (0) = 2r^ t cro, 

for iV»r^, (10) 

which is increased by the factor 2r A t compared 
to the result over a sample of ./V independent 
configurations. In our simulations, configura- 
tions separated by 2 to 3 x r^ t are considered 
as 'decorrelated' 5 . 

On a finite time-series it is difficult to estimate 
the slowest exponential autocorrelation time re- 
liably as the tail of the autocorrelation function 
becomes compatible with zero. This leads to an 
unwanted bias in T^ t . A practical solution to 
this problem is the application of a 'window' pro- 
cedure as introduced in Ref. [25] to extract the 
integrated autocorrelation time: a cut-off t in the 
sum for r^ t is increased until a plateau becomes 
visible. As a rule of thumb, it has been suggested 
in [25] to determine t self-consistently in the range 

4 to 10r^ t . This amounts to a truncation effect 
(difference to the true tm) of less than 2%. 

The integrated autocorrelation time T int can be 
observable dependent. One effect is due to the 
time-space extension of lattice observables. Very 
extended quantities on the lattice might exhibit 
larger Ti nt . It can be shown in free field models 
that autocorrelation modes are related to lattice 
symmetries like e. g. translation invariance, and in 
this way to correlations on the lattice itself. As 
a consequence long range correlations across the 
lattice, as they occur for light masses, go along 
with larger autocorrelation times. 

In the following we determine the autocorrela- 
tion from a variety of observables. This is facil- 
itated because we have archived all trajectories 

5 Residual autocorrelation between successive measure- 
ments arc taken into account by binning the data prior 
to statistical analysis [30]. 



of the SESAM-simulation and every second tra- 
jectory of the TxL-simulation for a detailed post- 
processing. 

We illustrate the numerical impact of finite 
sampling on the estimate of autocorrelation in 
Fig. 8. One observes the long range autocorre- 
lation modes to emerge out of the noise level as 
the sample is enlarged. There is a certain thresh- 
old where the autocorrelation function ceases to 
be compatible with zero and becomes visible. 

Besides the average plaquette Su , we computed 
the autocorrelation of extended smeared observ- 
ables that exhibit a large ground state overlap 
per construction. This corresponds to rather long 
range correlations on the lattice. We consider the 
Gaussian smeared light meson masses, and 
to p , and smeared spatial Wilson and Polyakov 
loops built iteratively from fuzzed links of the 
form: 

U?(x) = Uj(x) 
U? +1 (x) = U?(x)U?(x + 2 n j) 

+ £ U?(x)U?(x + 2 n i) 

K|^j;i=±l,...,3 

x U?(x + 2 n (i+j)) 

x Ul l+ (x + 2 n+1 j). (11) 

n labels the smearing level. We investigate these 
quantities for n = 0,1,2,3 corresponding to 1 x 
1,2x2, 4x4, 8x8 Wilson loops, respectively. 

Another 'fermionic' monitoring quantity is the 
inverse of the average number of iterations N^ y 
of the Krylov solver. For the conjugate gradient 
(CG) it has been demonstrated in Ref. [31] that 
this quantity is related to the square root of the 
ratio of the minimal to the maximal eigenvalue of 
a hermitian positive definite matrix H, i.e., its 
condition number K = . This is motivated 
by the following considerations: let us start from 
the definition of «g Ca as the value of the hopping 
parameter, k, at which the pion mass vanishes 

^ocm^iQ-^). (12) 

The convergence rate of the conjugate gradient 
algorithm, given a residual norm r, can be ex- 
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Figure 8. For k sc& = 0.157, measurements on 
samples of increasing size are shown for the aver- 
age plaquette. 



tracted from the bound [32] 



r < 2 



( vK, 



Va/aT. 



Ncg 



(13) 



This inequality provides an estimate to the con- 
vergence behaviour of CG on a hermitian matrix 
H. The relation is based on the assumption of a 



uniform distribution of eigenvalues of H. Gener- 
ally one expects some dependence of r on the de- 
tailed distribution of eigenvalues, however. Close 
to k c the minimal real eigenvalue of M^M is small 
and is approximately that of M 2 , i. e., 



r < 2 - ANca 



A, 



An 



O 



(14) 



For the case of a poor condition number, 
1, we can exploit this relation and find for r fixed: 

Amir?. 



A oc 



(15) 



We know empirically that the ratios of conver- 
gence rates of BiCGstab and CG are quite con- 
stant over a rather large range of n. This sug- 
gests to utilise the convergence rate of BiCGstab 
as an indicator to the Monte Carlo evolution of 
the smallest eigenvalue of the fermion matrix. 
Since small eigenvalues correspond to large cor- 
relation lengths, X m i n presumably projects max- 
imally onto the slowest relevant autocorrelation 
mode of the system. 

In Fig. 9 the integrated autocorrelation times 
of si and ss-smeared n and p 'masses' are shown. 
Here we take the 'mass' as computed from the 
propagator of each individual trajectory as an es- 
timate. Autocorrelations of various gluonic ob- 
servables (Wilson loops and Polyakov loops) are 
displayed in the lower part of the figure. In both 
pictures A, as estimated by Eq. 15, is at the upper 
end for the autocorrelation times 6 , together with 
the 8x8 Wilson loop. 

The plot illustrates the quality of the sig- 
nal. It is evident that the exponential auto- 
correlation times are bounded by the minimal 
eigenvalue. For all reference quantities we ob- 
served clear plateaus in n n t(t). The Wilson and 
Polyakov loops provide evidence that geometri- 
cally extended quantities indeed suffer more from 
autocorrelation effects. 

So far, we have compared various observables 
at one value of K sea , K sca = 0.157. In the following 



6 Slower modes might exist for the topological charge, but 
seem to be decoupled from the observables of interest. 
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Figure 9. For K soa = 0.157, wc demonstrate that 
the integrated autocorrelation time of A is an up- 
per bound for the autocorrelation times of all 
'fermionic' and 'gluonic' observables measured. 



Fig. 10, the sea-quark dependency of the expo- 
nential autocorrelation function is illustrated for 
three different observables: the plaquette action, 
the sl-smeared pion mass and the spatial Wilson 

loop, 0(8,8). 

The smeared quantities exhibit an early asymp- 



totic behaviour while the un-smeared average pla- 
quette appears to couple to many excited modes. 
The integrated autocorrelation times, Ti n t(t), of 
the pion masses and the spatial Wilson loops as 
a function of the cut-off t each reach an asymp- 
totic plateau for t, ni < 1/At within the estimated 
errors. Assuming a single exponential to domi- 
nate p A (t), the systematic error is w 2% and thus 
smaller than the statistical uncertainty. 

A compilation of all integrated autocorrelation 
times can be found in Fig. 11 
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Figure 11. The sea quark-dependency of the es- 
timated integrated autocorrelation times. 



Originally we believed that the restriction of 
the pseudo-fermionic d.o.f. to the even sub-space 
[20], as offered by the o/e preconditioning scheme 
for M, would affect the dynamics of HMC only 
marginally, such that we have chosen this option. 
In the LL-SSOR scheme, however, the pseudo- 
fermion field has to live on the entire lattice. 

It is now very interesting to compare autocor- 
relation on the full and the o/e reduced system, 
see Fig. 12: we show the integrated autocorre- 
lation time of A for K sea = 0.1575. On the fully 
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Figure 10. Autocorrelation functions and integrated autocorrelation times for the quantities: £ti,7r, 8 S 
for all three sea quark masses. The 2% curve is plotted for orientation. The horizontal lines indicate the 
values of exponential and integrated autocorrelation times. 
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occupied system, we have employed the SSOR al- 
gorithm and have generated trajectories of half 
length, T = 0.5. One would have anticipated the 
autocorrelation time to increase by a factor of 2 
in trajectories. We found, however, a factor of y/2 
only compared to the o/e reduced system's value. 
Thus, doubling the number of pseudo-fermionic 
degrees of freedom appears to affect autocorre- 
lation to the extent that higher stochasticity de- 
creases the autocorrelation. 

The lessons to be learnt here are: the question 
of optimal trajectory length of the HMC has to be 
addressed anew and thinning out fermions turns 
out to be counter-productive. 








20 40 60 80 100120 140 
t 



Figure 12. The integrated autocorrelation time of 
the averaged number of BiCGstab iterations for 
K sca = 0.1575. 



elude that the autocorrelation times come out 
much smaller than anticipated previously. The 
exponential autocorrelation times all are well be- 
low the value of 50 trajectories. With the knowl- 
edge of Ti n t , we can assess the computational cost 
to generate one independent full QCD vacuum 
state. Taking the sustained performances reached 
in the o/e and LL-SSOR preconditioned HMC 
(67% and 38%, respectively) we estimate these 
numbers in Gnops-hours units, see Tab. 4. More 
detailed results will be given in [33]. 

4. FLAVOUR SINGLET OPERATORS 

The computation of nucleonic matrix elements 
of flavour singlet operators involves notorious 
noise problems when dealing with the contribu- 
tions of disconnected quark diagrams (DQD). 
This presents a headache in particular, when 
analysing full QCD vacuum configurations for sea 
quark effects, given the limitations in the statis- 
tical sampling. 

The common technique to handle DQD is the 
stochastic estimator method. SESAM has paid 
particular attention to devise methods for sig- 
nal improvements, starting out from the work of 
the Tsukuba and Kentucky groups [27,26] in the 
quenched situation [28]. 

The most simple case of DQD to consider is 
the 7T-N (T-term, cr^r, which amounts to deter- 
mine the correlator between the nucleon propa- 
gator, P, and disconnected closed quark loops, 
Tr(M~ r ). According to the Kentucky technique 
the latter are estimated by inverting the Dirac 
operator on a stochastic source, with Z2 noisy 
entries in all space-time and spin-colour compo- 
nents of the source vector. 

The quantity to measure is given by the asymp- 
totic slope in time t, in the expression for the 
correlator 



R(t) 



disc 



t large 



(P(0 -» t)Tr{M~ 1 )) 
const + t(P\qq\P) l £& 



(Tr(M-i)) 



(16) 



For the algorithmic part of this talk we con- 



which is prone to noise problems from the very 
subtraction on the entire space-time volume. It 
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Table 4 



CPU cost for the generation of full QCD vacuum states from our Quadrics implementation. 



Lattice 16 3 32 at = 5.6 


ftsea 


VMGcV] 


L s [im} 




tti-k /m p -ratio 


QH2[h] -> GFlopsxh 


0.156 (OE) 


2.19(8) 


1.44(5) 


0.4482(40) 


0.8388(41) 


w 5 -» w 43 


0.157 (OE) 


2.25(6) 


1.39(4) 


0.3412(33) 


0.7552(69) 


w 22 -» w 189 


0.157 (SSOR) 


11 


11 


11 


11 


wl7-» w 82 


0.1575 (OE) 


2.38(7) 


1.32(4) 


0.2763(29) 


0.688(12) 


w 45 -» w 387 


0.1575 (SSOR) 


11 






11 


w 35 -> « 170 




Figure 13. New plateau sampling technique for 
the computation of disconnected contributions to 
the tt-N a term in full QCD. 



is obvious that loops located far away from P(0 — > 
i) in time will add in particular to this noise 
level. But the highly separated terms (in time) 
are expected to be decorrelated from the nuclcon 
propagator! The obvious measure to counteract 
the loss of signal is to confine the noisy source 
into the time interval of < t < t. This new 
plateau sampling technique leads to a substantial 
improvement of the signal to crjv, as can be seen 
from Fig. 13. The variance turns out to be re- 
duced by almost a factor 2, see Fig. 14. 

This result is very encouraging in view of other, 
more complicated matrix elements, such as the 



Figure 14. Same as Fig. 13. The source is spread 
over the entire lattice. 



flavour singlet axial vector current which is of 
tantamount importance in proton spin studies. 
In this case a weak signal has been observed on 
a large quenched sample by the Tsukuba group 
some time ago [27]. The QCDSF collaboration, 
however, could not see a signal above the noise 
level, within their high statistics study of the pro- 
ton structure functions [29]. It is a challenge to 
look for progress in the signal preparation. 

5. LIGHT HADRON SPECTRUM 

With three SESAM samples at = 5.6 (and one 
T%L sample at K sca = 0.1575) we can perform the 



1G 



chiral extrapolation in K soa for hadronic observ- 
ables. We try to estimate the influence of dynam- 
ical fermions by comparison with quenched QCD 
at the equivalent scale. At each sea-quark sample 
zero-momentum two-point functions for mesons 
and baryons (cf. Tab. 5) are computed for vari- 
ous valence hopping parameters Ky. Altogether 



Table 5 

Hadron operators. 



meson 




pseudoscalar 

vector 

scalar 

axial- vector 


Xps(x) = q'(x)j b q(x) 
X ^(x) = q'{x)Yq{x) 
XSc(x) = q'{x)q(x) 
XAx{x) = q'{x) lb ^q(x) 


baryon 


X{x) 


nuclcon 
A 


Xn(x) = e abc {q a C^ b q b )q c 
Xa( x ) = £abc{qaC^q b )q c 



we work with fifteen mass estimates per K soa , 
Tab. 6. 



Table 6 

Valence kappa values. 



sec 



/^sca 


{ny} 


0.156 


{0.156, 0.157, 0.1575, 0.158, 0.1585} 


0.157 


{0.1555, 0, 156, 0, 1565, 0.157, 0.1575} 


0.1575 


{0.1555, 0, 156, 0, 1565, 0.157, 0.1575} 



We want the propagators to be dominated 
by the lightest state at small time separations. 
Therefore we apply Wuppcrtal smearing with 
smearing parameter a — 4 and 50 smearing itera- 
tions. Smeared-smeared (ss) and smeared-local 
(si) correlators are used in simultaneous mass- 
estimates. We fit to the data on time-slices 10 
to 15 (see Ref. [30] on how to determine the fit 
ranges and all that). Autocorrelation times of 
Ti n t < 25 trajectories have suggested to calcu- 
late propagators on configurations separated by 
25 HMC trajectories, the residual correlation is 
taken into account by jackknifc binning. 



We perform single-exponential fits: 
C(t) m , 

C(t)ba 



= A(e- ,nt + e- m ( T -V), 
Ae- mt . 



(17) 



The effective masses are determined iteratively 
for the mesons, by solving the equation 



c AB (t) 

C AB {t+l) 

p -m e ff(t)t 



-m eft (*)(T-t) 



e -m eff (t)(t+l) _|_ g-m 6ff (t)(T-t-l) ' 

and directly for baryons, 

Cab (t) 



m e ff (t) = log 



C AB (t + l)' 



(18) 



(19) 



Light Mesons 

We carry out linear extrapolaions in the lattice 
quark mass, using data with ny = K soa , generi- 
cally called to ss : 



TO PS,ss 



= c 



my ss = m 



1 



+ bm 



1 



(20) 



PS,ss 



These fits, called "symmetric", are shown in 
Fig. 15. We find the pseudoscalar mass to be well 
matched by the linear ansatz (with a x 2 /d.o.f = 
0.002), whereas the vector masses may exhibit 
some downward curvature (based on a % 2 /d.o.f = 
1.1). The lattice spacing is obtained from M p as 
physical input 7 . The results are quoted at both 
the chiral limit and the experimentally given mass 
ratio, defining n 1 ^ 1 : 



0.1785. 



(21) 



The lattice spacings from the p become 
a- 1 = 2.35(6) GeV at < ea , 



a^ 1 = 2.33(6) GeV at k. 



light 

sea ' 



(22) 



with 



< ea = 0.15846(5), = 0.15841(5). 



(23) 



7 Continuum masses go with capital "M", whereas lattice 
masses arc written as "m". 
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k through -4r = i 



6.35 6.40 



where «i and k£ 



refer to valence quarks in a meson. 

In Ref. [34], we have introduced a notation to 
specify these valence quarks with masses differ- 
ent from sea quark masses. m 5S refers to both 
m v 's equal m sea of the sample, m sv means only 
one m v equal to m sca , and m vv stands for neither 
m v equal to m sca . We argued that mp S an d 
m™ „, as linear functions of and — can be 
parametrised by the two sets of equations: 




m 



(24) 



0.6 



0.5 



0.4 — 



0.3 




0.00 0.02 0.04 0.' 

m s -(lA eff = I/O 




m 



m 
rn 



'PS.ss 
2 

PS.sv 
2 

PS,vv 



(25) 



with C34 = 2c 3 and C13 = 2cg — c and m v = 
i ( -4ir s— ) • A combined linear fit of all the 

Z \ K v K sea / 

pscudoscalar data with the ansatz of Eq. 24 leads 
to an acceptable x 2 /d.o.f = 4.4/23. For further 
details and the vector meson fits we refer to [30] . 
We determine K stran s° from m sv by matching 



m-ir flight ^.strange 



) _ M, 



K* 



m v ,ss(Ki 



light N 

sea ) 



1.16 



(26) 



Figure 15. TTLpg ss clS 3- function of — =— and my,., 
as a function of m sea (in lattice units). 



Strange Mesons 

For want of 3 dynamical quarks, we have to 
find a way to compute strange mesons in a sea 
of two light mesons. This requires valence quarks 
different from sea quarks. We define an effective 



where k^ 1 * is given by Eq. 23. Alternatively, 
^strange is computed from m vv matching 



my 1 ,w(K^ t ,« Btrap « e ) 

TO V 2 ,ss( K s'c?a ) 



M„ 



= 1.326 



(27) 



In this manner we can calculate the masses of 
the K (K*) and the <p composed of the appro- 
priate light and strange quarks in a sea of light 
quarks. 



18 



Baryon Masses 

The remaining independent quantities to de- 
termine after fixing Kg Ca via pseudoscalar meson 
mass and a via p mass are the masses of nucleon 
and A. We perform the chiral extrapolation by 
fitting to a linear function, the systematic error 
is taken from quadratic 3 parameter fits. 

In Fig. 16 we show an overview of light and 
strange hadron masses comparing our three sim- 
ulations (SESAM, T^L, and quenched) 8 . 7r and p 
are used to fix scales and chiral limit and there- 
fore must coincide with experimental values. For 
the A, sea quark effects seem to be visible, while 
for the nucleon there is no observable difference to 
quenched simulations. It remains to be seen how 
these findings are affected including the k — 0.158 
sample from T%L. As expected finite volume ef- 
fects (comparing extrapolations on SESAM data 
with T%L data for k = 0.1575) are largest for nu- 
cleon and A (around 5 to 8 % ). The conversion 
to physical units slightly softens this effect since 
the p is some 2 to 3 % lighter on the 24 3 lattices. 
From the statistical accuracy achieved a sensible 
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Figure 16. The light mass spectrum in the 'simul- 
taneous' chiral limit. 



continuum extrapolation appears to be in reach 
of Tera-computer performance! 



6. LIGHT QUARK MASSES 

Light quark masses-albeit being fundamental 
parameters of the Standard Model-are not di- 
rectly accessible to experiment. 

In principle, lattice methods allow to compute 
their absolute values from the QCD Lagrangian, 
using the values of hadron masses as physical in- 
put [35-37]. Wilson fermions appear appropriate 
here as-unlike staggered fermions-they discretise 
full QCD with correct flavour attributes. 

In a recent analysis of world data, unquenching 
(from Nf = to Nf = 2) seems to lower the 
values of light and strange quark masses by about 
50 % [37]. To clarify these indications we have 
investigated, within the SESAM project, the mass 
renormalization of the light and strange quarks 
under the influence of two dynamical flavors [34] . 

In order to extract the masses of the light and 
the strange quark from our meson data the lattice 
results have been extrapolated to the experimen- 
tal mass ratios Eq. 21, Eq. 27, and Eq. 26. At 
each sea-quark sample we evaluate meson masses 
with strange valence quarks and perform an ex- 
trapolation of these masses to the physical sea of 
light quarks. 

Under the condition that we have to treat the 
u-d isospin doublet as degenerate we can extract 
the masses of the degenerate light quarks in a sea 
of degenerate light quarks as well as the mass of 
the strange quark. 

We take the values from our light spectrum 
computations and convert to physical units in the 
M5-scheme according to: 



"MS 



(p) = Z M (na)m sca a p 



(28) 



3 The T%L data are preliminary (ft = 0.1575 only). 



Zm{i^o) is computed in boosted 1-loop perturba- 
tion theory [35,39]. Finally we run the values to 
2 GeV, see Tab. 7. 

We remark that one could have determined 
«.strangc by matching the ratio ^f- using the 
symmetric fit only [37], see Tab. 7. This re- 
sult, 77^ ngo (2GeV) = 80(8) MeV turns out to 
be much smaller than the value found before, 
m^ nge (2GeV) = 140(20) MeV. However, the 
would consist of strange valence quarks under the 
influence of a sea of strange quarks, which is a 
poor description of the physically correct situa- 
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Table 7 

K-values and corresponding light quark masses. The scale is taken from the p: a~ 1 2.33(6) GeV at fi^fa*- 





K 


m w (2GeV) 


N f = 2 


light 


0.15841(5) 


2.7(2) MeV 


strange 


0.15615(20) stat (20) s y st 


140(20) MeV 


strange sea 


0.15709(12) 


80(8) MeV 


N f = 


light 




5.5(5) MeV 


strange 




166(15) MeV 



tion of two light dynamical quarks. 

Comparing our results to the quenched values 
at corresponding /3 qU enched = 6.0, we observe a 
much smaller dynamical light quark mass [37]. 
m strange /m hght s=s 52, whereas the strange mass is 
compatible to the quenched value within errors. 

Let us comment on the dramatic change in the 
light quark mass due to unquenching. We can 
define a quark mass at fixed sea-quark: 



Setting wips w ( K v) = at Ky ^ forces the 
'pion' to become massless at «y. 

In the manner of quenched computations we 
measure a 'bare light quark mass' at each of the 
three sea-quark values. A chiral extrapolation of 
these quark masses in the sea-quark to the chi- 
ral point would yield the value A 2 in Fig. 17, 
while the true value (from a pion at a physical 
sea-quark) is given by Ai < A 2 . We might try to 
solve this Ai-A 2 -discrepancy and extrapolate A 2 
to the location of the light sea-quarks. However, 
either we give up working at the physical pion 
mass — as the critical kappa -\- would become too 
low otherwise — or again the 'quark mass' given in 
this way is too large by about a factor 2. As these 
'light quark masses' are very similar to that of the 
quenched simulation (5.7(4), 5.6(3), 5.4(3) MeV), 
the conclusion is that it is not possible to estimate 
the light quark mass from quenched computations 
since the light quark mass known by symmetric 
extrapolation cannot be recovered computing in 
valence quark style at fixed sea quark mass, with 
subsequent extrapolation in m sca . 



1/K 




1/ K sea 



Figure 17. Schematical plot of — and t A ht vs. 

1 



7. HEAVY QUARKONIA 

As has been reported by C. Davies in her con- 
tribution to this workshop, nonrclativistic QCD 
(NRQCD), an effective theory for the low energy 
regime of heavy quarkonia, has proven to be an 
efficient tool to directly calculate bottomonium 
on the lattice [40]. Full QCD simulations using 
dynamical staggered quarks have shown the sen- 
sitivity of fine and hyperfine splittings to vacuum 
polarisation. Lattice observables extrapolated to 
Nf — 3 have turned out to be in remarkable 
agreement with experimentally known quantities, 
thus unknown quantities can be predicted with 
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some confidence. 

In this section, we present preliminary results 
of our systematic study of lattice NRQCD for bb 
systems with dynamical Wilson quarks at three 
different masses, an NRQCD action that includes 
relativistic corrections of order 0(Mf,v 6 ), mean 
field improvement with uo computed from the 
mean link in Landau gauge and efficient wave 
function smearing. 

7.1. Technique of NRQCD 

The nonrelativistic Lagrangian is decomposed 
into the kinetic energy operator, 



H 



A( 2 ) 



(30) 



which is of order Mi,v 2 , and relativistic correc- 
tions the importance of which is estimated via 
power counting. We include operators of order 
M b v 4 and M b v 6 [41], 



SH = SH^ + 



(31) 



with 



- Ci 



(A< 2 



8M fc 3 



C3 



8M 2 



a - (A 



x E - E x 



C4 - - - a ■ B 



a 2 AW _ a(A( 2 ))' 
C5 24M b C6 16nM 2 ' (3 ] 



and 



^ (6) =-^{A«,-B} 

~ C8 64^ {A(2) ' CT ' (AXE " E>< A)} 

8M 3 



(33) 



Derivatives and fields with tilde have their lead- 
ing order discretisation errors removed in order to 



correct for O (a 2 M b w 4 ) errors. Following Ref. [40] 
the quark Green's function is calculated from the 
evolution equation 



G(t + l) 



G(0) 



aH n 



2n 

(l-aSH)G(t), 
<5 x ,o • 



u! [ i 



2n 



(34) 
(35) 



The parameter n allows to stabilise the evolution 
in case of small bare quark masses. For the T 
system n = 2 is appropriate. The Lagrangian 
is tadpole improved a fact that may justify a 
tree level matching to QCD, i.e. all the coef- 
ficients Ci are set to one. Note, however, that 
first order perturbative corrections to some inter- 
actions may well be of the same sizes as relativis- 
tic 0(Mf,v e ) corrections. We choose uo to be the 
mean link in Landau gauge as recently there have 
been hints for better scaling properties associated 
with this choice compared to the average plaque- 
tte prescription [42] . We find a two percent differ- 
ence in u between both choices for all n values. 
The tadpole improved chromo-fields then differ 
by about eight percent. Through the whole sim- 
ulation we fix the heavy quark mass to a value 
bMj, = 1.7. Taking advantage of the smallness of 
the bottomonium system, we exploit configura- 
tions more than once by starting the propagator 
evolution both at different spatial source points 
located on one and the same timeslice and on dif- 
ferent timeslices. 

Meson correlation functions are built from 
quark propagators combined with suitable inter- 
polating operators: 



/nmcson 



(*) = 



^Tr[Gt(x,i)rJ sfc) (y-x)G(y,t)] , (36) 
x,y 

where the source smeared propagator G is ob- 
tained by evolving the extended source: 

G(y)=5>(y-x,t)L (sc) (x) . (37) 

X 

We adopt spectroscopic notation and de- 
note (radially excited) spin-parity eigenstates 
by n 2S+1 Lj. The interpolating operator 
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Y(sc/sk)^ = £iq>(sc/sk) ^ x -j con tains a spin matrix 
and a spatial smearing function. The latter is cal- 
culated as the solution of the Schrodinger Equa- 
tion with the Cornell potential for definite radial 
quantum number and angular momentum. Note 
that 'local' P-states are realized through deriva- 
tives acting on the delta function. For the T 
and r]b we calculate a 4 x 4 matrix of correlations 
with sc/sk = 1, 1,2,3 corresponding to a point 
source, the ground state, the first and second ex- 
cited states respectively. For the L = 1 states we 
restrict ourselves to the ground state and the first 
excitation as signals are worse. Gauge configura- 
tions are fixed to Coulomb gauge. 

7.2. Data Analysis 

Figure 18 gives an impression of the signals' 
quality. Concerning the L = states we are 
able to force correlators to stay in the first excita- 
tion for about ten time-steps. Local masses for P 
states are much noisier and drop to the ground 
state without dwelling in an 'excited plateau' 
first. To extract energies we fit several correlators 
simultaneously to a multi-exponential ansatz. We 
find that vector fits to smeared-local propagators, 



E k t 



(38) 



k=l 



are quite stable whereas matrix fits demand for 
higher statistics. Tab. 8 gives a representative 
sample of fits. We determine hyperfinc splittings 
by single exponential fits to the ratio of smeared- 
local propagators thus exploiting the strong cor- 
relation between them. More complicated fit- 
functions confirm the results obtained from the 
single exponential ansatz but do not behave very 
stable. 

7.3. Results 

The lattice scale is taken from the average of 
the 25 - 15 and CG( 3 P) - 15 splittings (see 
Tab. 9). We do not include 0(a 2 ) gluonic cor- 
rections at this stage as their effect is not visi- 
ble with present statistic. For the ratio of split- 
tings, R = (2 3 5i - 1 3 51)/(CG( 3 P) - l 3 5i), we 
obtain a value R = 1.28(10) at k = 0.1575 and 
R = 1.30(9) at k = 0.157 which is consistent with 
the experimental number R = 1.28. Tab. 10 lists 
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Figure 18. Effective masses of smeared-local 
correlators for the T and fit, n sea = 0.1575, 
n CO nfig — 700. Propagator indices label the ra- 
dial quantum number of the smearing function. 
Note that Gu and G31 rise sharply indicating the 
sudden decay of the dominating excited state to 
the ground state. 
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Table 8 



Vector Fits to smeared-local correlators for T and fit, states. Errors are calculated from 200 bootstraps. 







tmax 


aE 1 


aE 2 


aE 3 


X'ldof 


Q 




2 


4 


30 


0.3589(8) 


0.605(6) 




54.5/48 


0.241 




6 


30 


0.3587(8) 


0.599(7) 




46.9/44 


0.355 




8 


30 


0.3586(8) 


0.599(10) 




46.1/40 


0.235 




10 


30 


0.3583(9) 


0.585(15) 




44.1/36 


0.167 


3 


4 


25 


0.3578(9) 


0.589(12) 


0.782(18) 


63.0/54 


0.188 




5 


25 


0.3581(10) 


0.592(13) 


0.762(28) 


61.2/51 


0.155 




6 


25 


0.3577(12) 


0.574(18) 


0.741(32) 


58.2/48 


0.148 




7 


25 


0.3579(11) 


0.581(20) 


0.78(5) 


56.0/45 


0.126 




2 


4 


30 


0.533(5) 


0.728(20) 




48.4/48 


0.457 




5 


30 


0.535(6) 


0.724(15) 




41.2/46 


0.672 




6 


30 


0.536(8) 


0.719(21) 




41.1/44 


0.597 




7 


30 


0.533(10) 


0.713(28) 




40.6/42 


0.534 



our results in lattice units, Figure 19 sketches the 
T spectrum and hyperfine-splittings. 

Table 9 



Lattice Spacings. 



Lattice Spacings 


k=0.157 


k=0.1575 


a - 1 (2 3 S 1 -l 3 S 1 )[GeV] 
a - 1 (l 3 S 1 -lCG( 3 P))[GeV] 


2.34(15) 
2.40(13) 


2.48(14) 
2.50(11) 



Table 10 



Fit results for radial and spin splittings. 



aM' J = 1.7 


k = 0.157 


k = 0.1575 


2 l S Q - 1 


0.245(8) 


0.240(7) 


3 1 So — 1 1 Sq 


0.46(5) 


0.41(4) 


2 3 5i - 1 3 Si 


0.241(13) 


0.226(15) 


3 3 5i - 1 3 Si 


0.41(2) 


0.38(3) 


1 L P! - 1 3 Si 


0.186(8) 


0.176(9) 


2 1 P 1 - 1 1 P 1 


0.23(2) 


0.18(3) 


1 3 S X - 1 


0.0142(2) 


0.0135(2) 


1 6 P 2 - 1 l P 1 


0.0032(11) 


0.0039(5) 


1 L P X - 1 3 Pi 


0.0040(10) 


0.0026(7) 


1 A Pi - 1 3 P 


0.015(4) 


0.013(2) 



7.4. Remarks 

We have presented preliminary results on the 
bottomonium spectrum calculated from NRQCD, 
in a gauge field background with 2 flavours of dy- 
namical Wilson quarks. Radial excitations and 
hypcrfine splittings have been determined for two 
ensembles of configurations corresponding to two 
different sea-quark masses. We observe better 
agreement with experiment as compared to the 
quenched result confirming the results of simula- 
tions with staggered fcrmions. Up to now we are, 
however, not able to give a conclusive statement 
concerning the dependence of energy levels on the 
dynamical quark mass. To decide on this issue 
and to disentangle unquenching effects from rel- 
ativistic corrections we are going (i) to complete 
the analysis with a third K-value (n sea — 0.156), 
(ii) we will carry out a quenched simulation for 
our choice of heavy quark action and (Hi) we will 
exploit T%L configurations at n sea — 0.158. 

SUMMARY 

I have tried to give an impression of the goals 
and achievements of the SESAM and the T%L 
projects, simulations with Wilson fermions at 
intermediate lattice sizes and dynamical quark 
masses. To set a significant landmark we have 
concentrated on one j3 value. 
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Figure 19. T Spectrum and Hyperfine Splittings: Quenched: (3 — 6.0, 16 3 x 32 [40]; Dynamical Staggered: 
(3 = 5.6, am, = 0.01, 16 3 x 32 [43]; Dynamical Wilson: (3 = 5.6, k = 0.157, 0.1575, 16 3 x 32. Errors are purely 
statistical. Note that Davies et al. have updated their quenched simulation meanwhile, see C. Davies, these 
proceedings. 



It would of course be highly desirable to carry 
out a scaling analysis in full QCD, i.e., the ex- 
trapolation to the continuum limit. Such a study, 
making use of improved actions, will be the ad- 
equate computational challenge for computers of 
> CP-PACS performance! 
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